WO 99/61637 PCT/CA99/00479
GENE SEQUENCES OF RUBELLA VIRUS ASSOCIATED WITH ATTENUATION 



Background of the Invention 



Rubella virus is the causative agent of German
measles, a viral infection associated with a mild fever and
rash. The most serious complications of rubella occur
during pregnancy due to transplacental passage of the virus
to the fetus resulting in the widespread manifestations of
congenital rubella. These include fetal loss, or
multisystem defects in the newborn such as cataracts,
deafness, cardiac abnormalities and microcephaly.

To prevent congenital infection, a universal
vaccination scheme for all children around 15 months of age
was implemented in North America in 1969, using attenuated
vaccines which had recently been developed. While reducing
the level of rubella circulating in the community,
vaccination of young children did not significantly alter
the proportion of women entering their childbearing years
without protective levels of circulating antibody,
reported to be around 10-15%. This population was
therefore also targeted for vaccination.

Vaccination reduced the incidence of congenital
rubella but was found to be associated with a number of
sequelae, particularly in women over 25 years of age.
Symptoms included arthritis, neurological manifestations
and chronic fatigue. The most notable complication of
rubella immunisation was arthritis, which has also
frequently been documented as a consequence of natural
rubella. The joint symptoms induced can be severe in the
acute stage but usually resolve without causing permanent
joint damage. Occasionally, however, chronic or recurrent
arthritis develops which can persist for many months or
years in certain individuals (Ford et al., 1988).

Several vaccines have been used in North America since
1969. These include two variants of the HPV77 strain
originally produced by Dr. H. Meyer from the wild strain
M33 by multiple passages in monkey kidney cells (Meyer
et al., 1969). The HPV77 strain was further attenuated by
an additional 5 passages in duck embryo cells (to give the
HPV77/DE5 strain) or by 12 passages in dog kidney cells (to
give the HPV77/DK12 strain). The HPV77/DK12 vaccine proved
to be too reactogenic even in children and was soon removed
from distribution. The HPV77/DE5 vaccine was used as part
of the M-M-RI vaccine (measles/mumps/rubella combined
vaccine; Merck Sharp & Dohme; West Point, Pa. U.S.A.) until
1979 when it was replaced in the M-M-RII vaccine by the
RA27/3 strain (Plotkin and Buser, 1985), which is the
current vaccine strain used in North America.

The Cendehill strain (Peetermans & Huygelen, 1967) was
developed in Belgium and was the predominant strain used in
vaccine production in Europe until 1989. The Cendehill
strain is reported to be associated with a decreased
incidence of complications in the adult female population
in a comparative study of five vaccines. Best et al.
(1974) reported that acute arthritis occurred in only 3% of
individuals immunised with Cendehill vaccine but in 17% of
those receiving RA27/3. Moreover the symptoms with RA27/3
were also more prolonged. The disadvantage of the
Cendehill vaccine was that the mean titre of HAI antibody
induced in vaccine recipients was lower than that obtained
with the RA27/3 strain, indicating that Cendehill is less
immunogenic.

A close correlation has been found between the ability
of a given strain of rubella virus to infect and persist in
human joint tissue in culture and its association with the
induction of arthropathy in vivo, suggesting that tropism
for joint tissue is an important determinant of the ability
to induce joint symptoms (arthritogenicity). As reported
in Miki and Chantler (1992), wild-type strains (Therien and
M33) were found to grow to high titres of 10^6-10^7 pfu/ml in
the medium of either cell cultures or organ cultures
derived from human joint tissue. In contrast the RA27/3
strain was considerably restricted for growth, giving yields
of 10^3-10^4 pfu/ml, and the Cendehill strain showed no growth
at all. These results correlate with the known
associations of rubella strains and joint symptoms in vivo.



Rubella virus is a small (60-70 nm) enveloped
togavirus, the sole member of the genus Rubivirus. It has
a single-stranded RNA genome approximately 10 kb in size.
The genomic RNA is positive-stranded, which means that it
can act as mRNA within the infected cell. The sequence of
the entire genome has been determined for two wild-type
strains, Therien and M33 (Dominguez et al., 1990; Gillam
et al., 1993, Genbank No. X72393), and the RA27/3 vaccine
strain (Pugachev et al., 1997). The genome contains two
large open-reading frames (ORFs) which code for the
structural proteins (3' proximal 3189 nucleotides) and
non-structural proteins (5' proximal 6345 nucleotides).
The current understanding is that the open-reading frames
for the structural and the non-structural proteins are
separated by a region of about 123 nucleotides.

The infected cell contains two virus-induced
positive-strand RNA species, the genomic RNA (40S; 10 kb)
and a sub-genomic mRNA (26S; 3 kb) which encodes the major
ORF for the structural proteins. The ORF for structural
proteins is translated into a 110 kD polyprotein and is
subsequently cleaved by cellular signal peptidase into the
three structural viral proteins, E1, E2 and C. The order
of structural genes was originally determined by
synchronised translation as being NH2-C-E2-E1-COOH, which
was confirmed by sequence analysis of cDNA clones of the
subgenomic mRNA (Clarke et al., 1987; Frey & Marr, 1988;
and Zheng et al., 1989).

The non-structural (NS) genes are translated from the
full-length genomic RNA as a >200 kD polyprotein which is
subsequently cleaved into two non-structural proteins, p150
and p90. These comprise the enzymes required for viral
replication in the cell. Protein p150, nearest the
5' terminus, is 1300 amino acids in length and encodes the
putative methyltransferase function and the viral protease.
Protein p90 is 905 amino acids long and has regions of
homology with global helicase and replicase domains.



Summary of the Invention



This invention provides nucleic acids (DNA or RNA)
comprising one or more sequences of nucleotides
corresponding to all or part of the genome of the Cendehill
strain of rubella virus. Nucleic acids of this invention
may encode an infectious virus of the Cendehill strain or
one having an attenuated phenotype equivalent to Cendehill
strain. DNA of this invention may be in a plasmid or viral
vector which enables replication and/or transcription of
the Cendehill cDNA and is referred to herein as a Cendehill
infectious clone. The infectious clone may be used to
produce a DNA vaccine for rubella virus.

This invention also provides a nucleic acid (DNA or
RNA) comprising a sequence of nucleotides that includes a
first portion corresponding to one or more of the
non-translated regions, p150, p90, C, E2 and E1 gene
regions of Cendehill strain and a second portion that is
derived from another rubella virus strain such that the
product encodes a novel infectious chimeric rubella virus
strain. DNA of this invention may be in a plasmid or viral
vector forming an infectious clone.



This invention also provides a chimeric
Cendehill/RA27/3 clone whose genome includes a first
portion corresponding to the Cendehill 5' non-translated
RNA, Cendehill p150 and p90 and wherein a second portion
corresponds to the structural gene region and the 3'
non-translated region of RA27/3 strain. This clone can be
used to produce a chimeric virus that expresses the
structural proteins of RA27/3 but has the genetic structure
at the 5' end and in the non-structural genes of Cendehill
strain that determine the non-arthrotropic nature of this
strain.



This invention also provides RNA encoding the entire
genome of Cendehill or the Cendehill/RA27/3 chimera or a
fragment thereof, by transcribing the aforementioned DNA.
This invention also provides rubella virus produced by
transcribing the DNA, transfecting cells with the RNA so
derived, and recovering virus from cells so transfected.



This invention also provides a nucleic acid encoding
one or more Cendehill strain rubella virus proteins
selected from the group consisting of: p150, p90, C, E1 and
E2, or wherein the nucleic acid corresponds to a
non-translated region of the Cendehill genome. The nucleic
acid may be DNA or RNA and may be incorporated into a
plasmid or viral vector for expression of protein.

This invention also provides a method of producing
Cendehill viral protein comprising the steps of expressing
a DNA sequence encoding a protein corresponding to
Cendehill protein p150, p90, C, E2 or E1 in a cell by means
of a suitable expression vector and recovering the protein
so expressed. The protein may be a Cendehill protein
having a sequence corresponding to a portion of the cDNA
sequence in Appendix 1 or the protein may be altered by
modification of the Cendehill cDNA, as described herein.

This invention also provides a method of producing a
recombinant DNA encoding a mutated or chimeric rubella
virus exhibiting the lack of arthrotropicity of the
Cendehill strain but with additional advantageous
properties that include, but are not restricted to,
increased immunogenicity or stability of another rubella
strain.

This method comprises steps whereby:

(a) nucleotides in Cendehill cDNA encoding viral
structural proteins are altered such that the protein so
encoded increases the immunogenicity or stability of a
recombinant rubella virus comprising said protein; or

(b) nucleotides in the non-translated regions or
non-structural gene region of cDNA for rubella virus other
than Cendehill are altered to decrease arthritogenicity of
a recombinant rubella virus coded for by the altered cDNA.

cDNA from steps (a) or (b) may be incorporated into a
plasmid or viral vector to produce an infectious clone,
from which RNA may be transcribed and transfected into
cells to provide virus that may be used as a recombinant
rubella vaccine. Alternatively, cDNA from (a) or (b) in a
suitable vector may be used as a DNA vaccine.

This invention also provides a rubella virus whose
genetic material comprises a first portion corresponding to
one or more RNA sequences selected from the group
consisting of: Cendehill non-translated RNA, Cendehill
p150, p90, C, E1 and E2 RNA; and wherein a second portion
of the genome corresponds to RNA of a rubella virus other
than Cendehill.

This invention also provides a Cendehill viral protein
free of virus, selected from the group consisting of: p150,
p90, C, E1 and E2, produced by expressing Cendehill cDNA
encoding said protein from an expression vector.

This invention also provides rubella cDNA, RNA or a
rubella virus having one or more of the Cendehill
strain-specific nucleotides selected from a group
consisting of: 37-C, 55-G, 118-T (or U), 358-C, 2829-A,
3060-G, 3164-C, 3528-T (or U), 4530-T (or U), 6611-C,
6770-G, 6771-G, 7428-T (or U), 8786-G, 8788-T (or U), 8864-A,
9180-T (or U), 9254-A, and 9741-T (or U). The aforesaid
nucleotide numbers are in reference to nucleotides bearing
the same numbers as shown in Appendix 1 for Cendehill.

cDNA, RNA or virus of this invention may have the
strain-specific nucleotide at a different nucleotide
position number as compared to Cendehill, providing the
context of the strain-specific nucleotide is the same as
for Cendehill. In this instance, context defines the five
nucleotides on either side of the strain-specific
nucleotide in Cendehill.
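
By way of illustration only, the context test described
above may be sketched as a search for the eleven-nucleotide
window (the strain-specific nucleotide plus five nucleotides
on either side) taken from a reference sequence. The
function name, its parameters and the short sequences used
with it are hypothetical and are not taken from Appendix 1;
exact string matching is an assumed reading of "the same
context".

```python
def find_strain_specific_context(candidate, reference, ref_pos, flank=5):
    """Return the 1-based positions in `candidate` at which the reference
    nucleotide at `ref_pos` occurs in the same context, i.e. with the same
    `flank` nucleotides on either side. Hypothetical helper for illustration."""
    # Treat RNA and cDNA alike by writing U as T.
    candidate = candidate.upper().replace("U", "T")
    reference = reference.upper().replace("U", "T")

    start = ref_pos - 1 - flank      # 0-based start of the context window
    end = ref_pos + flank            # exclusive end of the context window
    if start < 0 or end > len(reference):
        raise ValueError("context window runs off the reference sequence")
    window = reference[start:end]

    hits = []
    i = candidate.find(window)
    while i != -1:
        hits.append(i + flank + 1)   # position of the central nucleotide
        i = candidate.find(window, i + 1)
    return hits
```

An empty list indicates that the context is absent from the
candidate sequence; a non-empty list gives the position
number or numbers at which the nucleotide occurs in context,
which need not equal the Cendehill position number.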



This invention also provides a Cendehill cDNA, and
genomic RNA that encodes a rubella virus protein selected
from the group of proteins p150, p90, C, E1 and E2 and with
one or more Cendehill strain-specific amino acids defined
as p150/929/tyr, p150/1006/gly, p150/1041/his,
p150/1162/val, p90/1496/ile, C/34/pro, C/87/gly,
E2/306/val, E2/413/ile, E1/759/asp, E1/785/met,
E1/890/leu, and E1/915/thr. The aforesaid strain-specific
amino acids are identified by protein name, amino-acid
position within the Cendehill rubella polyprotein, and the
identity of an amino acid at such a position. Such
proteins of this invention include proteins having the
strain-specific amino acid at a different amino acid
position number in the protein as compared to Cendehill,
providing the context of the strain-specific amino acid is
the same as for Cendehill. In this instance, context is
defined as including the three amino acids to either side
of the strain-specific amino acid in Cendehill. In this
specification, reference to a strain-specific amino acid
such as p150/929/tyr will be used to identify the amino
acid as well as a protein (eg. p150) containing the
strain-specific amino acid, in context as described herein.

This invention also provides a nucleic acid (eg. DNA)
for the first 5' non-translated region (NTR) and first stem
loop (nucleotides 1 to 65) equivalent to that found in the
Cendehill strain and characterised as being a major
determinant of growth restriction in joint tissue.
Specific characteristics of this stem loop in Cendehill
include two nucleotide changes from the wild-type Therien
strain, a U to C at nucleotide 37 in the predicted terminal
loop that alters the size of the loop from 6 to
11 nucleotides, and an A to G at nucleotide 55 that
increases the size of the predicted medial loop from
6 to 10 nucleotides. These two nucleotide changes at these
positions and in the context found in Cendehill strain
(defining the five nucleotides on either side of each
nucleotide) are determinants of arthrotropism. Other
mutations between nucleotides 20-28 and 52-60 that either
increase or decrease the predicted size of the medial loop
are included within the scope of this invention. Similarly
any mutation that alters the predicted size of the terminal
loop and alters the phenotypic characteristics of the virus
is within the scope of this invention. Factors that
define the determinants of joint cell restriction include
sequence-specific changes in the medial or terminal loop or
changes that alter the size of either or both of the loops.
These regions include nucleotides 20-28, 33-43 and 52-60.

Appendix 1 sets out the sequence of cDNA representing
the Cendehill genome. The locations of the various
non-translated regions and coding regions are shown. Two
polyproteins are encoded, beginning at the start codons
indicated for p150 and the C protein, respectively. The
amino acid sequence of each polyprotein and the respective
structural and non-structural proteins may be determined
from the nucleotide sequence of Appendix 1. In this
specification, the location of an amino acid will be given
by reference to a residue number of a polyprotein, which
residue number may be determined directly from the series
of codons shown in Appendix 1 commencing at one or the
other of the start codons.
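
By way of illustration only, counting codons from a start
codon to obtain a polyprotein residue number may be sketched
as follows. The function names are hypothetical, and the
position of the first nucleotide of the start codon is left
as a parameter because it is read from Appendix 1 rather
than stated here. The sketch assumes an uninterrupted
reading frame with no insertions or deletions relative to
the numbering used.

```python
def residue_number(nt_pos, start_codon_pos):
    """Residue number of the codon containing nucleotide `nt_pos`, counting
    the start codon (first nucleotide at `start_codon_pos`) as residue 1.
    Positions are 1-based. Hypothetical helper for illustration."""
    if nt_pos < start_codon_pos:
        raise ValueError("nucleotide lies upstream of the start codon")
    return (nt_pos - start_codon_pos) // 3 + 1


def codon_span(residue, start_codon_pos):
    """First and last nucleotide positions (1-based) of the codon that
    encodes `residue` in the same reading frame."""
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    first = start_codon_pos + 3 * (residue - 1)
    return first, first + 2
```

All three nucleotides of a codon map to the same residue
number, so a substitution at any codon position is reported
against a single amino-acid position.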



The term "corresponding" as used in this specification
means that when a nucleic acid, peptide or protein is
described by reference to a specified nucleic acid, peptide
or protein, the nucleic acid, peptide or protein so
described may include a nucleotide or amino acid sequence
which differs from the sequence of the specified nucleic
acid, peptide or protein. Corresponding nucleic acids,
polypeptides or proteins will include sequences of
differing length or which differ by one or more
substitutions, additions or deletions. Nucleic acids,
peptides and proteins of this invention include fragments
of specified nucleic acids, peptides or proteins and may
include additional amino acid or nucleotide sequences from
that specified. Furthermore, corresponding nucleic acids
include complementary nucleic acids, meaning those nucleic
acids capable of base pairing with a specified nucleic
acid. Nucleic acids having sequences which differ from the
sequence of a specified nucleic acid due to degeneracy of
the genetic code are also included within the meaning of
the term corresponding. Further, nucleic acids which
encode peptides or proteins in which there are conservative
substitutions, additions or deletions as compared to a
specified peptide or protein are included. Any and all
such nucleotide variations and resulting amino acid
polymorphisms which provide the advantages of this
invention as described herein are within the scope of this
invention.

Nucleic acids within the scope of this invention may
contain linkers, modified or unmodified restriction
endonuclease sites and other sequences of nucleotides
useful for cloning, expression, or purification. Nucleic
acids within the scope of this invention may be
incorporated in a larger sequence of nucleotides, including
plasmids and vectors useful for manipulation or expression
of nucleic acids.



One measure of "correspondence" of nucleic acids,
peptides or proteins with respect to this invention is
relative "identity" between sequences. In the case of
peptides or proteins, or in the case of nucleic acids
defined according to an encoded peptide or protein,
correspondence includes a peptide having at least about 50%
identity, more preferably at least about 70% identity, even
more preferably at least about 90% identity, even more
preferably at least about 95% and most preferably at least
about 98-99% identity to a specified peptide or protein.
Preferred measures of identity as between nucleic acids are
the same as specified above for peptides, with at least
about 90% or at least about 98-99% identity being more or
most preferable.

The term "identity" as used herein refers to the
measure of identity of sequence between two peptides or
between two nucleic acid molecules. Identity can be
determined by comparing a position in each sequence which
may be aligned for purposes of comparison. Two amino acid
or nucleic acid sequences are considered substantially
identical if they share at least about 75% sequence
identity, preferably at least about 90% sequence identity,
even more preferably at least 95% sequence identity and
most preferably at least about 98-99% identity.



Sequence identity may be determined by the BLAST
algorithm described in Altschul et al. (1990) J. Mol. Biol.
215:403-410, using the published default settings. When a
position in the compared sequence is occupied by the same
base or amino acid, the molecules are considered to have
shared identity at that position. The degree of identity
between sequences is a function of the number of matching
positions shared by the sequences.
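
By way of illustration only, the counting of matching
positions may be sketched for two sequences that have
already been aligned to equal length. This sketch is not
the BLAST algorithm and does not perform alignment; the
function name, the gap character and the decision to leave
gapped columns out of the count are assumptions made for the
example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of compared positions at which two pre-aligned sequences
    carry the same base or amino acid. Columns containing a gap in either
    sequence are skipped (an assumed convention, not one the text fixes)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue                 # gapped column: not counted
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared
```

The value returned may be compared against the thresholds
given above, for example at least about 90% or at least
about 98-99% identity.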

An alternate measure of identity of nucleic acid
sequences is to determine whether two sequences hybridize
to each other under low stringency, and preferably high
stringency conditions. Such sequences are substantially
identical when they will hybridize under high stringency
conditions. Hybridization to filter-bound sequences under
low stringency conditions may, for example, be performed in
0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.2 x SSC/0.1% SDS at 42°C (see Ausubel
et al. (eds.) 1989, Current Protocols in Molecular Biology,
Vol. 1, Greene Publishing Associates, Inc., and John Wiley
& Sons, Inc., New York, at p. 2.10.3). Alternatively,
hybridization to filter-bound sequences under high
stringency conditions may, for example, be performed in
0.5 M NaHPO4, 7% SDS, 1 mM EDTA at 65°C, and washing in
0.1 x SSC/0.1% SDS at 68°C (see Ausubel et al. (eds.), 1989,
supra). Hybridization conditions may be modified in
accordance with known methods depending on the sequence of
interest (see Tijssen, 1993, Laboratory Techniques in
Biochemistry and Molecular Biology -- Hybridization with
Nucleic Acid Probes, Part I, Chapter 2 "Overview of
Principles in Hybridization and the Strategy of Nucleic
Acid Probe Assays", Elsevier, New York). Generally,
stringent conditions are selected to be about 5°C lower
than the thermal melting point for the specific sequence at
a defined ionic strength and pH.

Nucleic acids of this invention will preferably
exhibit substantial identity to Cendehill, with respect to
the regions of the Cendehill genome described herein which
relate to the arthrotropic phenotype of Cendehill. More
preferably, such regions will have at least about 98%
identity. Most preferably, there will be complete identity
in the "context" of Cendehill strain-specific nucleotides
or amino acids, as "context" is described herein.



With reference to nucleic acids corresponding to the
first 5' NTR of Cendehill, such correspondence may be
determined by predicting the folded structure of the region
rather than by measuring sequence identity. Nucleic acids
of this invention include a 5' NTR having a folded structure
in which one or both of the terminal and medial loops is
altered in size as compared to wild-type. The size of the
loop may be quantified according to the number of un-paired
bases in the loop region. Preferably, such alterations
result in an increase in size of the loop as compared to
wild-type. More preferably, such altered loops will be of
at least the size of the terminal and medial loops
described herein for Cendehill. Most preferably, the
sequence of un-paired bases in either loop region will be
substantially the same as described herein for Cendehill
loops. Further, nucleic acids of this invention comprising
a 5' NTR may include a bulge which is increased in size as
compared to the wild-type bulge and preferably will have at
least four un-paired bases in a bulge to one side of the
stem structure. Most preferably, the sequence of un-paired
bases in such a bulge will be substantially as described
herein for the Cendehill bulge. Determination of predicted
folding of a 5' NTR is carried out as described herein using
the Mfold™ 3.0 program.
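
By way of illustration only, quantifying loop size as the
number of un-paired bases may be sketched for a predicted
structure that has been written in dot-bracket notation,
in which "(" and ")" mark paired bases and "." marks an
un-paired base. The use of dot-bracket notation, the
function names and the example structure are assumptions
made for this sketch; it does not reproduce Mfold output
and it assumes a single unbranched stem-loop.

```python
def pair_table(structure):
    """Map the index of each opening bracket to the index of its partner."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs[stack.pop()] = i
        elif ch != ".":
            raise ValueError("unexpected character: %r" % ch)
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def loop_sizes(structure):
    """Return (terminal, medial): lists of un-paired base counts for the
    terminal (hairpin) loop and for medial loops or bulges, ordered from the
    base of the stem. Assumes a single unbranched stem-loop."""
    pairs = pair_table(structure)
    terminal, medial = [], []
    for i, j in sorted(pairs.items()):
        inner = structure[i + 1:j]
        if "(" not in inner:
            if inner:
                terminal.append(len(inner))    # bases closed by one pair
            continue
        k = structure.index("(", i + 1)        # next pair up the stem
        l = pairs[k]
        unpaired = (k - i - 1) + (j - l - 1)   # both sides of the stem
        if unpaired:
            medial.append(unpaired)
    return terminal, medial
```

A bulge to one side of the stem is reported in the medial
list in the same way as a two-sided medial loop, since both
are counted as un-paired bases lying between two stacked
base pairs.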

Variation in the immunogenicity, yield, stability or
pathogenicity of the product may readily be determined by
standard techniques by comparison to known strains such as
Cendehill. For example, mutation of Cendehill to increase
antigenicity may be determined by measuring increased
binding of a virus or viral protein to a known antibody to
rubella virus and comparing this binding to that of
Cendehill virus or protein at an equivalent concentration.

Arthrotropism, for the purpose of this specification,
is defined as the ability of a rubella virus strain to
replicate in pieces of human joint tissue weighing
approximately 0.1 gram cultured in 2 mls of medium and
yield virus of titres greater than 100 plaque-forming units
per ml of medium at 24 hours post-infection that increase
over the next 24 to 48 hours. Any virus less than
100 pfu/ml that does not show an increase in titre
represents residual virus from the inoculum. Following a
period to allow adsorption of virus in the inoculum to the
cells (4 hours), the joint pieces are washed 4 to 5 times
to reduce this residual virus and characteristically
10-100 pfu/ml of virus remains after this procedure.
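
By way of illustration only, the definition of arthrotropism
given above may be sketched as a test applied to titres
measured in the culture medium. The function name, the
dictionary layout and the example titres are hypothetical.
The sketch reads "increases over the next 24 to 48 hours"
as requiring that at least one titre measured at 48 or
72 hours post-infection exceeds the 24-hour titre, which is
an assumed interpretation.

```python
def is_arthrotropic(titres_pfu_per_ml, threshold=100.0):
    """Apply the working definition to titres keyed by hours post-infection.

    `titres_pfu_per_ml` maps a time point (24, and 48 and/or 72) to the
    titre in pfu per ml of medium. Returns True when the 24-hour titre is
    above `threshold` and a later titre is higher still."""
    if 24 not in titres_pfu_per_ml:
        raise ValueError("a 24-hour titre is required")
    later = [titres_pfu_per_ml[h] for h in (48, 72) if h in titres_pfu_per_ml]
    if not later:
        raise ValueError("a 48-hour or 72-hour titre is required")

    t24 = titres_pfu_per_ml[24]
    above_residual = t24 > threshold   # more than residual inoculum
    increasing = max(later) > t24      # titre rises after 24 hours
    return above_residual and increasing
```

A culture holding only residual inoculum, for example
50 pfu/ml at 24 hours and 40 pfu/ml at 48 hours, is reported
as not arthrotropic under this sketch.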

This invention also provides a method for constructing
chimeric rubella viral strains comprising part Cendehill
and part of a second rubella strain including steps
whereby:

(a) cDNA for one or more of the Cendehill
non-translated regions, non-structural proteins p150 and
p90 and structural proteins C, E2 and E1 is joined to cDNA
of a rubella virus other than Cendehill to produce DNA
corresponding to a complete RNA genome of a chimeric
rubella virus. This may also be incorporated into a plasmid
or viral vector to provide a chimeric infectious clone.

(b) the resulting altered cDNA clone may be
transcribed to produce RNA which may be used to transfect
cells to produce chimeric virus, which can be cultivated as
a seed stock for vaccine production.

This invention also provides rubella cDNA, RNA, or
virus wherein cDNA or RNA encoding one or more of the viral
p150 or p90 proteins or the cDNA or RNA corresponding to a
5' non-translated region is derived from or is mutated to
correspond to Cendehill, and at least part of the DNA, RNA
or viral RNA is derived from or is mutated to correspond
to rubella other than Cendehill. Preferably, the cDNA, RNA
or genome of the virus will have one or more substitutions
or deletions (as compared with Therien strain) in or near
the 5' non-translated region in the areas of nucleotides
17-65; substitutions in the non-structural gene coding
region resulting in one or more mutations of amino acids
929, 1006, 1041, 1162 of p150 protein or amino acid 1496 of
p90 protein; or, substitutions at or near nucleotides 118
or 358 of the non-structural gene encoding region.

This invention also provides the use of the
aforementioned cDNA, RNA, vectors (including infectious
clones) and viruses (recombinant or chimeric) in the
production of modified rubella cDNA, RNA or viruses,
production of modified rubella protein, and in the
production of rubella vaccines (DNA vaccines, live
attenuated viral vaccines and subunit vaccines).

This invention also provides the entire sequence of
the Cendehill strain of rubella virus, including the
identification of nucleotide substitutions relative to
wild-type strains which are unique to the Cendehill strain
and are associated with the attenuating phenotype. This
phenotype includes temperature sensitivity and the
restriction of growth in human joint tissue. These
substitutions can be incorporated into other rubella
strains such as the current RA27/3 vaccine to produce new
vaccine strains that are not arthritogenic. Such
substitutions may be in the region of nucleotides 17-65 (in
or near the first 5' non-translated region) which forms a
stem-loop structure. The substitutions may be at or near
nucleotides 118 or 358 of the non-structural gene region,
or the substitutions may involve one or more mutations of
amino acids 929, 1006, 1041, 1162 of p150 or amino acid
1496 of p90.



This invention also identifies mutations in Cendehill
virus structural gene regions associated with reduced
immunogenicity of this strain. These include two amino
acid substitutions in the E2 protein at amino acids 306 and
413 (ie. at nucleotides 7428 or 7746/47), and four amino
acid substitutions in E1 at amino acids 759, 785, 890 and
915 (ie. at nucleotides 8786/88; 8864; 9180; or 9254).
Alteration of some or all of these nucleotides to the
equivalent nucleotides found in a more immunogenic strain
such as RA27/3 or wild-type enables production of a
modified Cendehill strain which would be more antigenic.
This may also be used as an alternative vaccine.

The infectious clone of Cendehill strain exemplified
herein and identified as pJCND comprises a DNA copy of the
full-length Cendehill viral genome inserted into a vector
from which RNA transcripts of the genome can be synthesized
in vitro and which transcripts are infectious when
transfected into cells. In the case of pJCND, the vector
is the plasmid pCL1921, which was originally constructed
by Lerner and Inouye (1990) but modified by incorporation
of the pUC19 polycloning region (Yanisch-Perron et al.,
1985) and an SP6 RNA polymerase promoter. This plasmid is
replicated at low copy number (approximately 5 copies per
cell) and contains a spectinomycin resistance gene.
Transcription of pJCND or other infectious clones employing
Cendehill cDNA with a suitable polymerase (eg. SP6
polymerase for pJCND) enables the production of infectious
Cendehill RNA which can be transfected into cells to yield
a seed stock for obtaining recombinant rubella virus stocks
and rubella vaccines.

Methods for production of infectious clones,
subsequent expression of RNA, transfection of cells with
such RNA and production of virus as well as use of such
virus in the preparation of rubella vaccines are known, for
example as described in United States Patent 5,439,814 and
5,663,065 of Frey, et al. Suitable expression vectors for
rubella cDNA include those described herein as well as
others known in the art such as the pSI or pCI mammalian
expression systems (Promega) which incorporate the SV40 and
CMV Immediate Early enhancer/promoter systems
(respectively) or bacterial plasmids such as pUC19, pGEM or
pBR322 (Promega) incorporating a suitable promoter
sequence such as the SP6 promoter.

Methods for production of suitable expression vectors
for use in DNA vaccines are also known. For example, cDNA
derived from this invention may be expressed in pSI or pCI
described above or the vector could be a viral vector
modified to allow expression of foreign genes. Such
vectors derived from adenovirus, retrovirus, alphavirus, or
vaccinia virus are frequently modified to make them
non-pathogenic to the host. Such vectors expressing cDNA
derived from this invention may be used directly as a DNA
vaccine.

For preparation of chimeric strains according to this
invention, a preferred method is to synthesize cDNA from a
second rubella virus by preparing RNA from virus of the
second strain using established techniques and then
performing reverse transcription and PCR (polymerase chain
reaction) on the isolated RNA using primers which flank the
region of interest (for example, primers FI or 18 as
described herein for synthesis of the Cendehill/RA27/3
chimera). The cDNA is then subjected to restriction enzyme
digestion and resulting fragments are ligated into the
Cendehill infectious clone which has been similarly
digested to remove the same segment. Similarly, desirable
portions of the Cendehill cDNA (such as the non-translated
region, or non-structural genes) may be obtained by
ng fragment ligated into an
rubella strain which has been 



As exemplified herein, recombinant viruses were
derived from pJCND and Therien/Cendehill chimeras. These
strains were compared for their ability to grow in primary
human joint cells, enabling the identification of two
regions associated with growth restriction in these cells,
in the non-structural gene region. The identification of
these regions enables the production of further recombinant
virus strains which combine the phenotypic property of
joint growth restriction with the immunogenicity of other
rubella virus strains such as RA27/3, M33 or Therien.



Sequencing of pJCND enabled the identification of
nucleotide substitutions in Cendehill which are not present
in wild-type strains. The stem-loop region, which includes
a 5' non-translated region and extends into the
non-structural open reading frame (ORF), contributes to
joint growth restriction. This region has been shown to be
important in viral viability and virulence in some
α-viruses, including Sindbis virus and rubella virus
(Niesters & Strauss, 1990; Pogue et al., 1993; Pugachev &
Frey, 1998).

In the 3' subgenomic region, which includes the
structural gene region, the Cendehill strain contains
67 substitutions relative to the Therien strain: three in
the non-translated region (NTR) upstream of the
translational start site of the subgenomic RNA, two in the
3' NTR, and the remainder in the coding region. Many of the
substitutions in the structural genes occur as the third
base of a codon and do not affect the amino-acid
composition, leaving 16 substitutions in the
1062 amino-acids comprising the structural genes (nine of
which are also found in the M33 strain). The substitutions
include two substitutions in the capsid protein, two in the
E2 glycoprotein and four in the E1 glycoprotein.
Modifications to the Cendehill structural genes (for
example, by site-specific mutagenesis, linker-insertion
mutagenesis or homologous recombination) to provide a
strain with higher immunogenicity while retaining the
attenuating characteristics of Cendehill can therefore be
carried out.
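
The statement that many third-base substitutions leave the amino-acid composition unchanged follows from the degeneracy of the genetic code. The following is a minimal Python sketch, assuming the standard genetic code; the names used (CODON_TABLE, synonymous_fraction) are illustrative only and form no part of this specification. It estimates, for each codon position, the fraction of single-base changes that are silent.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_fraction(position: int) -> float:
    """Fraction of single-base changes at `position` (0, 1 or 2) of a
    sense codon that leave the encoded amino acid unchanged."""
    same = 0
    total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # stop codons are not used as starting points
            continue
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            if CODON_TABLE[mutant] == aa:
                same += 1
    return same / total


for pos in range(3):
    print(f"codon position {pos + 1}: {synonymous_fraction(pos):.2f} silent")
```

Under this assumption the third position gives by far the largest silent fraction, which is consistent with only a minority of the coding-region substitutions altering the encoded protein.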

Brief Description of the Drawings 

Figure 1 is a schematic showing the organization of
the rubella virus genome. The RNA is polyadenylated (An)
and both the genomic and sub-genomic species are capped
(CAP).

Figure 2 describes the oligonucleotide primers used
for reverse transcription of rubella virus RNA and
amplification of cDNA. Identification numbers for each
primer appear on the left. Viral genome positions
corresponding to nucleotide positions in Appendix 1 for
seven of the primers appear on the right.

Figure 3 is a schematic showing four Cendehill cDNA
fragments used to construct chimeric viruses and a
Cendehill infectious clone, beneath a general
representation of the viral genome. Restriction sites are
identified and the locations of sites used for construction are
indicated by the dotted lines. Primers used to generate
each cDNA fragment are indicated by primer identification
numbers (from Figure 2) at fragment termini.

Figure 4 is a schematic showing the modified 
polycloning site of pCLPC, which is derived from pCL1921. 

Figure 5 is a schematic of a cloning strategy for
production of Cendehill and Cendehill chimeric clones.
Cendehill double-stranded (ds) cDNA fragments are cut using
the appropriate restriction enzymes and inserted
sequentially into similarly restricted regions of pROBO302.

Figure 6 is a schematic comparing pROBO302 to a
full-length Cendehill clone (pJCND) and two Cendehill
chimeras (pROC3 and pROC3M). Regions without
cross-hatching are Therien and cross-hatched regions are
Cendehill.

Figure 7 shows predicted 5' stem-loop structures of
rubella RNAs generated by the Mfold™ 3.0 program using the
published default settings, with the RNA designated as
linear. Figures 7A, 7B and 7C are for Cendehill, wild-type
and RA27/3, respectively. The wild-type structure shown in
Figure 7B is the same for the Therien and M33 strains and
also the HPV77 vaccine.

Figure 8 is a schematic showing the non-structural
gene region and the position of amino acid substitutions in
the Cendehill strain relative to Therien. Bars indicate
mutations described by single-letter amino acid codes.

Figure 9 is a schematic showing the structural genes,
glycosylation sites and the position of the amino acid
substitutions in the Cendehill strain as compared to
Therien, including those shared with the M33 strain (unshaded
bars). Solid bars indicate mutations unique to Cendehill.

Detailed Description of Embodiments of the Invention

An infectious clone comprising a cDNA copy of all of 
the RNA of the Cendehill strain of rubella virus was 
produced as described below. 

Isolation of Viral RNA

Cendehill virions were obtained from the medium of Vero
cells infected with Cendehill virus (Rohm Pharma) by
pelleting supernatant virus for 4 hours at 18000 rpm
in a Sorval™ centrifuge. Viral RNA was isolated by
extraction with acidified phenol/guanidinium isothiocyanate
using Trizol™ (Gibco/BRL) according to the manufacturer's
instructions. RNA was precipitated from the aqueous phase
by the addition of isopropyl alcohol (1:1) and washed with
75% ethanol diluted in DEPC-treated H2O prior to drying and
resuspension in DEPC-treated ddH2O.

Reverse Transcription 

Specific primers complementary to the published
sequence of the Therien strain were used to initiate the
first strand of DNA synthesis. The primers used were #16,
38 and 125 (Figure 2). For each reaction, the primer was
mixed with viral RNA in H2O (total volume and heated
for 3 min at 90°C. RNA was then transcribed using 200 U of
Superscript II™ (Life Technologies). The standard reaction
mixture contained 10 mM dithiothreitol and 1 mM dNTPs. The
volume was brought to 100 µl by addition of TE buffer and
heated to 90°C to inactivate the reverse transcriptase.
Enzyme, primers and excess nucleotides were removed by
extraction of the mixture with phenol/chloroform/isoamyl
alcohol (25:24:1, by volume), followed by precipitation at
-20°C in 0.3 M sodium acetate and 66% ethanol.

Thermal Cycling Amplification 

After generation of the first strand of DNA by reverse
transcription, double-stranded cDNA was made by thermal
cycling amplification with a Minicycler™ (MJ Research)
using the specific primers (described in Figure 2, according
to the scheme shown in Figure 3) and repeated cycles of
incubation with Deep Vent™ (NEB) thermostable polymerase
with 3'-5' proof-reading exonuclease activity. The
standard reaction mixture contained 400 µM dNTP, 2 mM MgSO4,
0.5 µM primer and 1 unit of polymerase. The products were
resuspended in H2O for ligation into the plasmid vector
pCLPC, a derivative of pCL1921 with the modified cloning
site shown in Figure 4.

Cloning 

Four cDNA fragments (as shown in Figure 3), amplified
in pCLPC (see Figure 4), were sequentially cloned into the
Therien infectious clone pROBO302 (Pugachev et al., 1997).
The cloning strategy is outlined in Figure 5. To confirm
insertion of the correct fragments, the sequence of each
clone was compared with that of pROBO302 and with Cendehill
cDNA sequenced directly following reverse transcription and
amplification.

Two chimeric strains and a full-length Cendehill clone
were produced:

(i) pROC3, which contains nucleotides 5357 to 9762 of
Cendehill as shown in Figure 5 and Appendix 1 (including
the entire structural gene region) and nucleotides 1 to
5356 of the Therien strain (the majority of the
non-structural genes and the 5' non-translated region);

(ii) pROC3M, which contains nucleotides 2803 to 9762 of
Cendehill (see Appendix 1) and nucleotides 1 to 2802 of
Therien; and

(iii) pJCND, which contains the entire genomic sequence
of the Cendehill strain (see Appendix 1).

These are shown in Figure 6.
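
The composition of the three clones can be expressed as a single junction coordinate per clone. The following Python sketch is illustrative only: it assumes that the Therien and Cendehill genomes are collinear and of equal length (no insertions or deletions), and the function name build_chimera and the toy sequences are hypothetical rather than taken from this specification.

```python
def build_chimera(therien: str, cendehill: str, first_cendehill_nt: int) -> str:
    """Return a sequence that is Therien up to (first_cendehill_nt - 1)
    and Cendehill from first_cendehill_nt onward (1-based numbering)."""
    if len(therien) != len(cendehill):
        raise ValueError("sequences must be aligned to the same length")
    if not 1 <= first_cendehill_nt <= len(cendehill) + 1:
        raise ValueError("junction lies outside the genome")
    cut = first_cendehill_nt - 1
    return therien[:cut] + cendehill[cut:]


# First Cendehill nucleotide in each clone, from (i) to (iii) above.
JUNCTIONS = {"pROC3": 5357, "pROC3M": 2803, "pJCND": 1}

# Toy 12-nucleotide "genomes" so that the junction is easy to see.
toy_therien = "A" * 12
toy_cendehill = "C" * 12
toy_chimera = build_chimera(toy_therien, toy_cendehill, 8)
print(toy_chimera)  # Therien at positions 1-7, Cendehill at 8-12
```

With full-length sequences, calling build_chimera with the value from JUNCTIONS for a given clone would give the expected sequence against which a sequenced construct can be compared.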

Screening of Constructs

The constructs were screened by restriction enzyme
digestion to determine that the inserts were the correct
size and had the expected restriction pattern. Each clone
was also screened for infectivity as follows. Small-scale
plasmid preparations were carried out by standard
techniques. These preparations were linearised by
restriction digestion with EcoRI at the 3' terminus of the
viral sequence. Positive-polarity viral RNA was generated
by transcription from the SP6 promoter and the products
were transfected into BHK21 cells by electroporation.
After 2 days the supernatants were transferred to Vero
cells and supernatant virus was removed for plaque
titration 4 days later. The 3 constructs all gave titres
of progeny virus of 10^5 to 10^6/ml after three serial passages
in Vero cells. The progeny viruses were designated ROC3,
ROC3M and JCND.

Phenotypic Characterisation of the Recombinant Viruses 

Attenuating characteristics examined included
temperature sensitivity and replication in human joint
cells.

(1) Temperature sensitivity: At 39°C the Cendehill strain
is growth-restricted while wild-type strains grow normally.
This is believed to be an attenuating characteristic as
growth of Cendehill would be limited in infected patients
by even mild fever induction. None of the three recombinant
strains grew at 39°C, indicating that they have the
attenuating phenotype. Similarly, measurements of the
stability of the recombinant strains on prolonged
incubation at 37°C, relative to the Therien and Cendehill
parental strains, showed that the infectivity of the
recombinants and Cendehill decreased rapidly to 0.5% of the
input (a 200-fold reduction) in 50 hours while the
reduction in Therien was only 10-fold.
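
The relationship between the percentage of input infectivity remaining and the fold reduction quoted above is simple arithmetic. The following Python sketch, with illustrative function names that are not part of this specification, converts between the two and expresses the difference between the recombinants and Therien on a log10 scale.

```python
import math


def fold_from_percent(percent_remaining: float) -> float:
    """Fold reduction corresponding to a percentage of input remaining."""
    if percent_remaining <= 0:
        raise ValueError("percentage must be positive")
    return 100.0 / percent_remaining


def percent_from_fold(fold: float) -> float:
    """Percentage of input remaining after a given fold reduction."""
    if fold <= 0:
        raise ValueError("fold reduction must be positive")
    return 100.0 / fold


recombinant_fold = fold_from_percent(0.5)   # recombinants and Cendehill
therien_percent = percent_from_fold(10)     # Therien after a 10-fold loss
log10_gap = math.log10(recombinant_fold / 10)

print(recombinant_fold, therien_percent, round(log10_gap, 2))
```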

(2) Growth in human joint cells: Mapping of the region of
the genome associated with joint cell restriction was
carried out by examining the ability of the recombinant
viruses to replicate after electroporation into human
synovial cells cultured according to the method of Miki and
Chantler (1993). The results showed that five days
following electroporation, the supernatant titre of pROC3
was the same as that for pROBO302 (the Therien clone). The
titre of electroporated pROC3M was 10-fold lower and no
growth was seen with pJCND on transfection of 0.5 µg of RNA
in each case (see Table I). Therefore the regions of the
Cendehill genome containing sequences involved in joint
cell restriction include nucleotides 2803 to 5355, which
are present in pROC3M but not pROC3, and the 5' end of the
genome, nucleotides 1 to 2803, which are specific to pJCND.
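
The mapping logic can be sketched as a set difference between the Cendehill-derived portions of the clones. The following Python sketch uses the coordinates given for clones (i) to (iii) above; the dictionary and function names are illustrative only. The boundaries it computes follow directly from those clone definitions and differ by one nucleotide from the ranges quoted in the preceding paragraph.

```python
# Cendehill-derived nucleotide ranges (1-based, inclusive) in each clone.
CENDEHILL_RANGES = {
    "pROBO302": [],               # Therien infectious clone
    "pROC3": [(5357, 9762)],
    "pROC3M": [(2803, 9762)],
    "pJCND": [(1, 9762)],
}


def positions(ranges):
    """Expand a list of inclusive ranges into a set of positions."""
    return {p for start, end in ranges for p in range(start, end + 1)}


def to_ranges(position_set):
    """Collapse a set of positions back into sorted inclusive ranges."""
    merged = []
    for p in sorted(position_set):
        if merged and p == merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], p)
        else:
            merged.append((p, p))
    return merged


def cendehill_only_in(clone, reference):
    """Cendehill sequence present in `clone` but absent from `reference`."""
    return to_ranges(positions(CENDEHILL_RANGES[clone])
                     - positions(CENDEHILL_RANGES[reference]))


print(cendehill_only_in("pROC3M", "pROC3"))   # 10-fold lower titre
print(cendehill_only_in("pJCND", "pROC3M"))   # no virus detected
```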

Table I

Rubella Virus Strain      Virus yield (pfu/ml)

Therien                   4.0 x 10^4
Cendehill                 1.5 x 10^1
pROBO302                  1.9 x 10^3
pROC3                     2.5 x 10^3
pROC3M                    2.4 x 10^2
pJCND1                    no virus detected
pJCND2                    no virus detected



Sequence Analysis 



Further definition of the nucleotide substitutions
involved in attenuation was obtained by sequence
analysis. The entire cDNA sequence corresponding to the
Cendehill genome was determined using an automated
sequencing system at the NAPS unit at the University of
British Columbia, employing Amplitaq Dye Terminator Cycle™
sequencing reagents (ABI) and analysing the fluorescent
products spectrophotometrically. The sequence obtained is
shown in Appendix 1. It was compared with the published
sequences of the Therien strain (Dominguez et al., 1990, later
corrected in Pugachev et al., 1997), a consensus M33
sequence (Clarke et al., 1987; Zheng et al., 1989; and
Pugachev, 1997) and the RA27/3 sequence (Pugachev et al.,
1997). Nucleotide substitutions specific to the Cendehill
strain in the area of the first 5' NTR, the non-structural
and structural genes, and in the 3' NTR are described in
detail below, in which the nucleotide numbering is
according to the whole genome shown in Appendix 1 and the
amino acid numbering is according to the polyproteins as
described above.

5' Non-translated Region (NTR) and Stem-Loop Region

Two substitutions, as shown in Table II, were identified
in this area.

Table II

nucleotide 37 : U to C
nucleotide 55 : A to G

These substitutions are in a stem-loop region that is
believed to be important in controlling viral replication
and translation. Alterations in this region destabilize
the stem structure and may affect binding of cellular or
viral factors important in viral replication.
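
The two substitutions of Table II can be checked against the first 60 nucleotides of the Cendehill cDNA as transcribed from Appendix 1. The Python sketch below is illustrative only: the wild-type sequence is not taken from a published Therien sequence but is inferred by reverting the two Table II positions, and U is written as T because the Appendix lists cDNA.

```python
# Nucleotides 1-60 of the Cendehill cDNA, as transcribed from Appendix 1.
CENDEHILL_1_60 = ("CAATGGAAGC" "TATCGGACCT" "CGCTTAGGAC" "TCCTATCCCC"
                  "ATGGAGAAA" "CTCCTGGATGA")

# Table II, position: (wild-type base, Cendehill base), with U written as T.
TABLE_II = {37: ("T", "C"), 55: ("A", "G")}


def revert(sequence, table):
    """Replace each listed Cendehill base by the wild-type base,
    checking first that the Cendehill base is the one observed."""
    bases = list(sequence)
    for pos, (wild, mutant) in table.items():
        if bases[pos - 1] != mutant:
            raise ValueError(f"position {pos} is not {mutant}")
        bases[pos - 1] = wild
    return "".join(bases)


def list_substitutions(reference, query):
    """1-based positions at which two aligned sequences differ."""
    return [(i + 1, r, q)
            for i, (r, q) in enumerate(zip(reference, query)) if r != q]


inferred_wild_type = revert(CENDEHILL_1_60, TABLE_II)
substitutions = list_substitutions(inferred_wild_type, CENDEHILL_1_60)
print(substitutions)
```

The check inside revert is the informative step: it would raise an error if the Appendix 1 sequence did not carry C at nucleotide 37 and G at nucleotide 55.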

The stem-loop structure may be predicted by computer
programs intended to generate representations of folded
structures. For the purposes of this specification, stem-loop
structures are determined by use of the Mfold™ 3.0
program from Dr. Michael Zuker, Washington University
School of Medicine (see: M. Zuker, et al.; Algorithms and
Thermodynamics for RNA Secondary Structure Prediction: A
Practical Guide, in RNA Biochemistry and Biotechnology,
J. Barciszewski & B.F.C. Clark eds., NATO ASI Series,
Kluwer Academic Publishers (1999); and D.H. Mathews, et al.
(1999) Expanded Sequence Dependence of Thermodynamic
Parameters Provides Robust Prediction of RNA Secondary
Structure, J. Mol. Biol. 288, 911-940). The Mfold 3.0
program may also be obtained on the Internet at:
http://mfold2.wustl.edu/~mfold/cgi-bin/nph-mfold-3.0.cgi.
The Mfold program default settings are used with the
input RNA sequence being designated as linear.

As shown in Figure 7A, the alteration at nucleotide 37
is in the terminal loop of the stem. With reference to
Figures 7A-C, the terminal loop of Cendehill is altered as
compared to the predicted terminal loop of both wild-type
and RA27/3 strains. As is also shown in Figure 7, the
substitution at nucleotide 55 increases the size of the
bulge in the stem of Cendehill as compared to the bulges of
wild-type or RA27/3. As is shown in Figure 7, the medial
loop of Cendehill is altered as compared to the medial loop
which appears in both wild-type and RA27/3.

Attenuation of the wild-type rubella phenotype is
expected upon alterations in the nucleotide region 15-65,
particularly in the regions 20-28, 33-43 and 52-60.
Alterations which increase the size of the bulge such that
a bulge to one side of the stem has at least four unpaired
nucleotides (such as is shown in Figure 7A) are also
associated with the Cendehill phenotype.
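
The two criteria above (position within a named sub-region, and a one-sided bulge of at least four unpaired nucleotides) can be sketched in Python as follows. The dot-bracket strings are hypothetical examples and are not the structures of Figure 7; the function names are illustrative, and the bulge calculation assumes a single unbranched hairpin.

```python
SUB_REGIONS = [(20, 28), (33, 43), (52, 60)]


def sub_region_for(nucleotide):
    """Return the sub-region containing a nucleotide, or None."""
    for start, end in SUB_REGIONS:
        if start <= nucleotide <= end:
            return (start, end)
    return None


def largest_one_sided_gap(dot_bracket):
    """Largest run of unpaired nucleotides on one strand between two
    consecutive base pairs of a single unbranched hairpin."""
    stack, partner = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            partner[stack.pop()] = i
    opens = sorted(partner)
    largest = 0
    for outer, inner in zip(opens, opens[1:]):
        if partner[inner] < partner[outer]:   # inner pair nested in outer
            left = inner - outer - 1
            right = partner[outer] - partner[inner] - 1
            largest = max(largest, left, right)
    return largest


# Hypothetical hairpins: a 4-nucleotide and a 3-nucleotide bulge.
four_nt_bulge = "((((" + "...." + "((((" + "...." + "))))))))"
three_nt_bulge = "((((" + "..." + "((((" + "...." + "))))))))"

print(sub_region_for(37), sub_region_for(55))
print(largest_one_sided_gap(four_nt_bulge) >= 4)
print(largest_one_sided_gap(three_nt_bulge) >= 4)
```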

Non-structural Gene (NSG) Region

Several mutations are found between nucleotides 2800
and 4550, including 5 mutations specific to the Cendehill
strain which are present in pROC3M but not in pROC3 and are
therefore associated with a significant restriction in
joint cell growth as described in Table I. These mutations
are delineated in Table III:

Table III

p150   nucleotide 2829   G to A   aa 929    cys - tyr
p150   nucleotide 3060   A to G   aa 1006   asp - gly
p150   nucleotide 3164   U to C   aa 1041   tyr - his
p150   nucleotide 3528   C to U   aa 1162   ala - val
p90    nucleotide 4530   C to U   aa 1496   thr - ile
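
Each entry of Table III can be checked for internal consistency: the stated base change should be able to convert a codon for the first amino acid into a codon for the second. The Python sketch below assumes the standard genetic code; the names are illustrative and the check does not use the actual codon context from Appendix 1.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
THREE_TO_ONE = {"cys": "C", "tyr": "Y", "asp": "D", "gly": "G", "his": "H",
                "ala": "A", "val": "V", "thr": "T", "ile": "I"}

# (nucleotide, old base, new base, old residue, new residue) from Table III.
TABLE_III = [
    (2829, "G", "A", "cys", "tyr"),
    (3060, "A", "G", "asp", "gly"),
    (3164, "U", "C", "tyr", "his"),
    (3528, "C", "U", "ala", "val"),
    (4530, "C", "U", "thr", "ile"),
]


def consistent_codons(old_base, new_base, old_aa, new_aa):
    """Codon pairs in which one old_base -> new_base change converts a
    codon for old_aa into a codon for new_aa."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != THREE_TO_ONE[old_aa]:
            continue
        for i, base in enumerate(codon):
            if base != old_base:
                continue
            mutant = codon[:i] + new_base + codon[i + 1:]
            if CODON_TABLE[mutant] == THREE_TO_ONE[new_aa]:
                hits.append((codon, mutant))
    return hits


for nt, old, new, aa_from, aa_to in TABLE_III:
    print(nt, aa_from, "->", aa_to, consistent_codons(old, new, aa_from, aa_to))
```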

Two of the NSG mutations lie within or in proximity to
a region of homology with the alphavirus NSP3 domain, while
the other two are in the protease domain, on either side
of cys 1151 at the catalytic site. The p90 mutation is in
the helicase domain.

In addition to the foregoing, there are two mutations
in the NSG region, shown in Table IV, which do not alter the
encoded amino-acid but may influence infectivity due to
changes in RNA structure.

Table IV

- nucleotide 118 C to U
(This substitution may be involved in
stem-loop structures at the 5' end)

- nucleotide 358 U to C
(This substitution is in the region of rubella
RNA involved in binding to the capsid protein)



Structural Gene (SG) Region 

The structural genes of rubella virus are produced
from a 3327 nucleotide subgenomic RNA as represented in
Figure 1. It consists of a short (78 nucleotide)
5' non-translated region (NTR), the structural genes, which
are translated from a single open reading frame (ORF), and
a short 3' NTR. Both the 3' and 5' NTRs are capable of
forming stem-loop structures, can bind host cell proteins
and are believed to be important in viral replication. In
the entire subgenomic RNA, 67 nucleotide substitutions were
identified in the Cendehill strain when compared with the
Therien strain (see Appendix 1). Two are in the 5' NTR
upstream of the translational start site, two in the 3' NTR
and the remainder are in the coding region. Many of the
substitutions in the structural genes occur as the third
base of a codon and do not affect the amino-acid
composition, leaving 16 substitutions in the
1062 amino-acids comprising the structural genes, eight of
which are also found in the M33 strain. The remaining
8 amino acid substitutions are not found in the HPV77/DE5
or RA27/3 vaccine strains either. The nucleotide/amino
acid substitutions specific to the Cendehill strain (other
than the 5' NTR substitutions) are shown in Table V(a)-(d),
in which the amino acid numbering is according to the
polyprotein.
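
The lengths quoted for the subgenomic RNA can be reconciled by simple arithmetic. The Python sketch below is illustrative only and rests on assumptions not stated in this specification: that the ORF ends with a single stop codon, that the 3327-nucleotide figure excludes the poly(A) tract, and that the subgenomic RNA ends at the 3' end of the genome (nucleotide 9762, the last nucleotide named in the clone descriptions).

```python
SUBGENOMIC_LENGTH = 3327      # nucleotides, from the text
FIVE_PRIME_NTR = 78           # nucleotides, from the text
POLYPROTEIN_RESIDUES = 1062   # amino acids, from the text
STOP_CODON_NT = 3             # assumption: one stop codon closes the ORF
GENOME_END = 9762             # assumption: subgenomic RNA is 3' co-terminal

orf_nt = POLYPROTEIN_RESIDUES * 3 + STOP_CODON_NT
three_prime_ntr_nt = SUBGENOMIC_LENGTH - FIVE_PRIME_NTR - orf_nt
subgenomic_start = GENOME_END - SUBGENOMIC_LENGTH + 1
three_prime_ntr_start = GENOME_END - three_prime_ntr_nt + 1

print("ORF length (nt):", orf_nt)
print("3' NTR length (nt):", three_prime_ntr_nt)
print("subgenomic RNA spans", subgenomic_start, "to", GENOME_END)
print("3' NTR spans", three_prime_ntr_start, "to", GENOME_END)
```

Under these assumptions the 3' NTR substitutions at nucleotides 9731, 9740 and 9741 fall within the computed 3' NTR.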

Table V(a): Protein C Region

nucleotide  6611    U to C    aa 34    ser-pro
nucleotides 6770    A to G    aa 87    thr-gly
            6771    C to G

The substitution at aa 34 occurs within a stretch of
28 amino-acids (28-56) believed to be important in binding
of protein C to viral RNA during encapsidation. A region
between amino-acids 64 and 97 has been shown to react with
a monoclonal antibody, indicating that this is an antigenic
region although not one of the reported major antigenic
sites.

Table V(b): Protein E2 Region

nucleotide  7428    C to U    aa 306    ala-val
nucleotides 7746    C to U    aa 413    thr-ile
            7747    G to U

The alanine to valine substitution at aa 306 is a
conservative change but lies within the first 26 residues
of protein E2, a region which has been identified as a
neutralising domain. The two changes at nucleotides 7746
and 7747 result in the loss of an Asn-X-Thr glycosylation
site, one of four N-linked glycosylation sites found in
the Therien strain. The literature is conflicting as to
whether the latter substitution is present in M33.

Table V(c): Protein E1

nucleotides 8786    A to G    aa 759    asn-asp
            8788    C to U
nucleotide  8864    C to A    aa 785    leu-met
nucleotide  9180    A to U    aa 890    his-leu
nucleotide  9254    G to A    aa 915    ala-thr

The four alterations in E1 all occur in the region of
the protein which is extruded into the lumen of the
endoplasmic reticulum, and is therefore also exposed on the
surface of the mature virion. The first substitution, at
amino-acid 759, alters an asparagine to an aspartic acid
residue with the resulting loss of an N-linked
glycosylation site, one of three in E1, all of which are
believed to be utilised. None of the substitutions in E1
are in regions identified as dominant epitopes of the
cell-mediated immune response, nor in regions identified by
monoclonal antibodies as being associated with
hemagglutination or neutralisation. However, they may alter
conformation-dependent epitopes associated with the
humoral response, affecting the immunogenicity of the Cendehill
strain, which reacts poorly with polyclonal antisera to the
Therien strain in immunoprecipitation and immunoblot
assays.
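
The loss of an N-linked glycosylation site by either type of substitution described above (Thr to Ile in E2, Asn to Asp in E1) can be illustrated with the usual Asn-X-Ser/Thr sequon, where X is any residue other than proline. The peptides in the following Python sketch are hypothetical and are not taken from the rubella sequence; the function name is illustrative.

```python
import re

# N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein):
    """1-based positions of the Asn residue of each N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


wild_type = "GANVTQ"      # hypothetical peptide with one sequon
thr_to_ile = "GANVIQ"     # third sequon residue changed, as at E2 aa 413
asn_to_asp = "GADVTQ"     # asparagine changed, as at E1 aa 759

for name, peptide in [("wild type", wild_type),
                      ("Thr to Ile", thr_to_ile),
                      ("Asn to Asp", asn_to_asp)]:
    print(name, sequon_positions(peptide))
```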

Table V(d): 3' NTR

nucleotide  9731    G to
nucleotide  9740    C to U
nucleotide  9741    C to U

This region, like the 5' NTR, is involved in RNA
replication. Although the substitutions at nucleotides
9731 and 9740 are also found in the M33 strain, they may
affect attenuation as M33 is a less cytopathic strain than
Therien.

The substitutions identified in the structural genes
of Cendehill are responsible for the lower antigenicity
and immunogenicity of this strain relative to Therien, M33
or RA27/3. Using the Cendehill infectious clone,
alterations to the structural genes (for example, by
site-directed mutagenesis) would enable the antigenicity of
this strain to be repaired. This would provide a novel
rubella strain with the attenuating phenotype of Cendehill,
including restriction of growth in joint cells, but with
the immunogenic properties of either a wild strain like
Therien or the RA27/3 vaccine strain. Alternatively, a
chimeric strain can be produced comprising (for example)
the entire structural gene region of RA27/3 inserted into
the Cendehill infectious clone. Either of these constructs
would provide an improved attenuated rubella vaccine.

Production of Modified Rubella Virus Strains 

Altered strains can be produced by standard
recombinant DNA technology as described in many current
textbooks including "Molecular Cloning: A Laboratory
Manual," edited by Maniatis, T., Fritsch, E.F., and
Sambrook, J. (Cold Spring Harbor Laboratory, Cold Spring
Harbor, N.Y., 1989) or "Current Protocols in Molecular
Biology," edited by Ausubel et al. (Wiley Interscience,
1987).

To alter specific nucleotides in the structural gene
region, oligonucleotide-directed mutagenesis and gene
amplification technology can be used as described by
Kiguchi (1989). This procedure involves synthesis of an
oligonucleotide specific for the region to be modified,
containing the required nucleotide substitution as well as
an appropriate restriction site. This can then be used as
one primer for a gene amplification reaction encompassing
the region of interest. A second primer is chosen which
includes a unique restriction site and which will yield a
fragment of suitable size. Following amplification of the
fragment, which now has the requisite nucleotide
substitution incorporated, the fragment is cloned into the
infectious clone, replacing the original sequence. In this
way, mutations can be incorporated into the gene sequence
either singly or sequentially until the resulting virus has
the desired properties.
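
The design of the mutagenic oligonucleotide can be sketched as follows. This Python example is a simplified illustration: the template is a hypothetical 30-nucleotide sequence rather than rubella sequence, the function names are not part of this specification, and the sketch introduces only the nucleotide substitution, without the restriction site that the procedure above also requires.

```python
def mutagenic_primer(template, position, new_base, flank=10):
    """Return an oligonucleotide matching `template` around a 1-based
    `position`, with `new_base` substituted at that position."""
    index = position - 1
    if not 0 <= index < len(template):
        raise ValueError("position lies outside the template")
    if template[index] == new_base:
        raise ValueError("substitution does not change the template")
    start = max(0, index - flank)
    end = min(len(template), index + flank + 1)
    return template[start:index] + new_base + template[index + 1:end]


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


template = "GCTAGGCCTTACGATCGGATCCAAGTCTGA"   # hypothetical sequence
primer = mutagenic_primer(template, position=15, new_base="C")

print(primer)
print(mismatches(primer, template[4:25]))
```

A primer of this form anneals to the template on both sides of the single mismatch, so that the amplified fragment carries the substitution.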



Production of Chimeric Virus Strains

A cDNA clone including the entire structural gene
region of a rubella strain such as RA27/3 can be made in the
following steps: (i) isolation of viral RNA from
high-titre virus stock, (ii) first strand cDNA synthesis
using a specific primer for the 3' end, (iii) amplification
of the structural gene region using primers F1 and 18
(Figure 2), (iv) digestion of the amplified fragment and
also pJCND with BglII and EcoRI, and (v) cloning of the
amplified fragment into pJCND (previously separated from
its digested insert).



Following the above-described scheme, a chimeric
Cendehill/RA27/3 clone whose genome includes a first
portion which is equivalent to the Cendehill 5'
non-translated RNA, Cendehill p150 and p90 and a second portion
equivalent to the structural gene region and the 3'
non-translated region of the RA27/3 strain was made. This
clone can be used to produce a chimeric virus that
expresses the structural proteins of RA27/3 but has the
determinants of arthrotropism found in the genetic
structure at the 5' end and in the non-structural genes of
the Cendehill strain.

This construct was produced by synthesising a cDNA/PCR
fragment, using RA27/3 RNA as template, equivalent to the
18-F1 fragment shown in Figure 2. This fragment was then
inserted into the Cendehill infectious clone using the
restriction enzymes BglII and EcoRI, in an identical manner
to the synthesis of pROC3 described elsewhere in this
specification. The new chimeric clone was sequenced
through nucleotides 6611 and 6770/6771 as well as through
nucleotides 8786/8788 and 8864 to ensure that replacement
of the 18-F1 fragment had occurred. The published sequence
of RA27/3 indicates that the latter strain has the same
nucleotides as the Therien strain at these positions (Pugachev
KV, Abernathy ES and Frey TK. Archives of Virology 142,
1165-1180, 1997: Genomic sequence of the RA27/3 vaccine
strain of rubella virus) while Cendehill is modified in
these regions as disclosed herein.
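
The sequence check described above amounts to reading the bases at six diagnostic positions. The following Python sketch takes the Therien and Cendehill bases from Tables V(a) and V(c) and relies on the statement that RA27/3 matches Therien at these positions; the function name and return labels are illustrative only.

```python
# position: (Therien and RA27/3 base, Cendehill base), Tables V(a) and V(c).
DIAGNOSTIC = {
    6611: ("U", "C"),
    6770: ("A", "G"),
    6771: ("C", "G"),
    8786: ("A", "G"),
    8788: ("C", "U"),
    8864: ("C", "A"),
}


def classify_insert(observed):
    """Classify a sequenced clone from the bases observed at the
    diagnostic positions (a dict mapping position to base)."""
    missing = sorted(set(DIAGNOSTIC) - set(observed))
    if missing:
        raise ValueError(f"positions not sequenced: {missing}")
    if all(observed[p] == DIAGNOSTIC[p][0] for p in DIAGNOSTIC):
        return "RA27/3-like"
    if all(observed[p] == DIAGNOSTIC[p][1] for p in DIAGNOSTIC):
        return "Cendehill-like"
    return "mixed"


replaced = {p: bases[0] for p, bases in DIAGNOSTIC.items()}
unreplaced = {p: bases[1] for p, bases in DIAGNOSTIC.items()}
print(classify_insert(replaced), classify_insert(unreplaced))
```

A clone reported as "Cendehill-like" or "mixed" at these positions would indicate that replacement of the 18-F1 fragment had not occurred as intended.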

Screening of Novel Rubella Strains

Modified cDNA clones incorporated in the pCL1921
plasmid can be transcribed into complete infectious RNA
from the SP6 promoter. The RNA produced can be transfected
into BHK-21 cells by a variety of techniques including
electroporation or use of Lipofectamine™ (Gibco/BRL). The
transfected RNA is translated and replicated in the cell to
yield virus with altered phenotypic properties according to
the mutations introduced. In this way, seed stocks of
rubella strains of this invention may be produced.

Phenotypic properties of rubella strains of this
invention can be monitored for characteristics associated
with attenuation and immunogenicity. For example, yield,
temperature sensitivity and the ability to grow in human
joint tissue can be determined as described previously for
pROC3 and pROC3M. The antigenicity of the strains can be
assessed using standard enzyme-linked immunosorbent assays,
immunoprecipitation assays and immunoblots with human
rubella seropositive antisera. The efficacy of a strain
for eliciting a strong neutralising antibody response can
be measured in rabbits and compared with the current
vaccine strain, RA27/3, and also the parental Cendehill
strain. In this way, novel strains can be assessed for
characteristics that would make them suitable for use as
improved attenuated vaccines.

Attenuated rubella strains may be used as a seed stock
for manufacturing vaccine. Virus from such a stock may be
combined with a variety of stabilisers such as saline,
phosphate buffer, polyethylene glycol or glycerin, as
currently used in vaccine preparations. The vaccine may be
produced in lyophilised form to aid long-term preservation.
It can also be combined with other vaccines such as mumps
and measles vaccines as in the current M-M-R formulation.

In addition to use of rubella virus strains of this
invention as live attenuated vaccines as described above,
modified infectious cDNA clones may also be used to produce
a DNA vaccine against rubella virus, either singly or in
combination with other DNA vaccines. For this, the cDNA of
the rubella virus strain is sub-cloned into an expression
vector (either plasmid or viral) which contains a suitable
eukaryotic promoter. Either the entire rubella virus
genome, the structural genes or immunogenic regions of the
structural genes can be used in this manner to directly
immunise patients. The DNA vaccine is taken up by cells
and transcribed from the eukaryotic promoter to yield RNA
which is translated into viral proteins. These in turn
elicit an immune response.



Other uses of the Cendehill infectious clone and its
derivatives include the production of large quantities of
virus for use as antigen in enzyme-linked immunosorbent
assays to assess human antibody levels against rubella. In
view of variations in the antigenicity of the different
rubella virus strains, it would be preferable to use
antigen known to react optimally according to the vaccine
strain delivered. For example, a virus strain with the
structural gene region identical to the vaccine in use, but
altered in the non-structural genes or NTR regions to
improve viral yield for antigen production, may be
propagated. Subsequently, the strain for use in
immunoassays would be treated to produce a non-infectious
antigen preparation. Alternatively, the structural
proteins alone could be produced from a suitable expression
vector to yield an antigen preparation with the correct
specificity.

Although various aspects of the present invention have
been described in detail, it will be apparent that changes
and modifications of those aspects described herein will
fall within the scope of the appended claims. All
publications and references referred to herein are hereby
incorporated by reference.

APPENDIX 1
Sequence of Cendehill virus cDNA

5' NTR   → p150

1    CAATGGAAGC TATCGGACCT CGCTTAGGAC TCCTATCCCC [ATG GAG AAA
50   CTC CTG GAT GAG GTT CTT GCC CCC GGT GGG CCT TAT AAC TTA ACC GTC GGC
101  AGT TGG GTA AGA GAC CAT GTC CGC TCA ATT GTC GAG GGC GCG TGG GAA GTG
152  CGC GAT GTT GTT ACC GCT GCC CAA AAG CGG GCC ATC GTA GCC GTG ATA
200  CCC AGA CCT GTG TTC ACG CAG ATG CAG GTC AGT GAT CAC CCA GCA CTC CAC
251  GCA ATT TCG CGG TAT ACC CGC CGC CAT TGG ATC GAG TGG GGC CCT AAA GAA
302  GCC CTA CAC GTC CTC ATC GAC CCA AGC CCG GGC CTG CTC CGC GAG GTC
350  GCT CGC GTC GAG CGC CGC TGG GTC GCA CTG TGC CTC CAC AGG ACG GCA CGC
401  AAA CTC GCC ACC GCC CTG GCC GAG ACG GCC AGC GAG GCG TGG CAC GCT GAC
451  TAC GTG TGC GCG CTG CGT GGC GCA CCG AGC GGC CCC TTC TAC GTC CAC
502  CCC GAG GAC GTC CCG CAC GGC GGT CGC GCC GTG GCG GAC AGA TGC TTG CTC
551  TAC TAC ACA CCC ATG CAG ATG TGC GAG CTG ATG CGC ACC ATT GAC GCC ACC
602  TTG CTC GTG GCG GTT GAC TTG TGG CCG GTC GCC CTT GCG GCC CAC GTC
650  GGC GAT GAC TGG GAC GAC CTG GGC ATT GCC TGG CAT CTC GAC CAT GAC GGC
701  GGT TGC CCC GCC GAT TGT CGT GGA GCC GGC GCT GGG CCC ACG CCC GGC TAC
752  ACC CGC CCC TGC ACC ACA CGC ATC TAC CAA GTC CTG CCG GAC ACC GCC
800  CAC CCC GGG CGC CTC TAC CGG TGC GGG CCC CGC CTG TGG ACG CGC GAT TGC
851  GCC GTG GCC GAA CTC TCA TGG GAG GTT GCC CAA CAC TGC GGG CAC CAG GCG
902  CGC GTG CGC GCC GTG CGA TGC ACC CTC CCT ATC CGC CAC GTG CGC AGC
950  CTC CAA CCC AGC GCG CGG GTC CGA CTC CCG GAC CTC GTC CAT CTC GCC GAA
1001 GTG GGC CGG TGG CGG TGG TTC AGC CTC CCC CGC CCC GTG TTC CAG CGC ATG
1052 CTG TCC TAC TGC AAG ACC CTG AGC CCC GAC GCG TAC TAC AGC GAG CGC
1100 GTG TTC AAG TTC AAG AAC GCC CTG AGC CAC AGC ATC ACG CTC GCG GGC AAT
1151 GTG CTG CAA GAG GGG TGG AAG GGC ACG TGC GCC GAA GAA GAC GCG CTG TGC
1202 GCA TAC GTA GCC TTC CGC GCG TGG CAG TCT AAC GCC AGG TTG GCG GGG ATT
1253 ATG AAA GGC GCG AAG CGC TGC GCC GCC GAC TCT TTG AGC GTG GCC GGC TGG
1304 CTG GAC ACC ATT TGG GAC GCC ATT AAG CGG TTC TTC GGT AGC GTG CCC CTC
1355 GCC GAG CGC ATG GAG GAG TGG GAA CAG GAC GCC GCG GTC GCC GCC TTC
1403 GAC CGC GGC CCC CTC GAG GAC GGC GGG CGC CAC TTG GAC ACC GTG CAA CCC
1454 CCA AAA TCG CCG CCC CGC CCT GAG ATC GCC GCG ACC TGG ATC GTC CAC GCA
1505 GCC AGC GCA GAC CGC CAT TGC GCG TGC GCT CCC CGC TGC GAC GTC CCG
1553 CGC GAA CGT CCT TCC GCG CCT GCC GGC CCG CCG GAT GAC GAG GCG CTC ATC
1604 CCG CCG TGG CTG TTC GCC GAG CGC CGT GCC CTC CGC TGC CGC GAG TGG GAT
1655 TTC GAG GCT CTC CGC GCG CGC GCC GAT ACG GCG GCC GCG CCC GCC CCG
1703 CTG GCT CCA CGC CCT GCG CGG TAC CCC ACC GTG CTC TAC CGC CAC CCC GCC
1754 CAC CAC GGT CCG TGG CTC ACC CTT GAC GAG CCG GGC GAG GCT GAC GCG GCC
1805 CTG GTC TTA TGC GAC CCA TTT GGC CAG CCG CTC CGG GGC CCT GAA CGC
1853 CAC TTC GCC GCC GGC GCG CAT ATG TGC GCG CAG GCG CGG GGG CTC CAG GCT
1904 TTT GTC CGT GTC GTG CCT CCA CCC GAG CGC CCC TGG GCT GAC GGG GGC GCC
1955 AGA GCG TGG GCG AAG TTC TTC CGC GGC TGC GCC TGG GCG CAG CGC TTG
2003 CTC GGC GAG CCG GCA GTC ATG CAC CTC CCA TAC ACC GAT GGC GAC GTG CCA
2054 CAG CTG ATC GCA CTG GCC TTG CGC ACG CTG GCC CAA CAG GGG GCC GCC TTG
2105 GCA CTC TCG GTG CGT GAC CTG CCC GGG GGT GCA GCG TTC GAC GCA AAT
2153 GCG GTC ACC GCC GCC GTG CGC GCT GGC CCC GGC CAG CTC GCG GCC ACG TCA
2204 CCG CCA CCC GGC GAC CCC CCG CCG CCG CGC CGC GCA CGG CGA TCG CAA CGG
2255 CAC TCG GAC GCC CGC GGC ACT CCG CCC CCC GCG CCT GTG CGC GAC CCG
2303 CCG CCG CCC GCC CCC AGC CCG CCC GCG CCA CCC CGC GCG GGT GAC CCG GTC
2354 CCT CCC ACT CCC GCG GAG CCG GCG GAT CGC GCG CGT GAC GCC GAG CTG GAG
2405 GTC GCC TAC GAA CCG AGC GGC CCC CCC ACG TCA ACC AAG GCA GAC CCG
2453 GAC AGC GAC ATC GTT GAA AGT TAC GCC CGC GCC GCC GGA CCT GTG CAC CTC
2504 CGA GTC CGC GAC ATC ATG GAC CCA CCG CCT GGC TGC AAG GTT GTG GTC AAC
2555 GCC GCC AAC GAG GGG CTG CTG GCC GGC TCC GGC GTG TGC GGT GCC ATC
2603 TTT GCC AAC GCC ACG GCG GCC CTC GCT GCA GAC TGC CGG CGC CTC GCC CCA
2654 TGC CCC ACC GGC GAG GCG GTG GCG ACA CCC GGC CAC GGC TGC GGG TAC ACC
2705 CAC ATC ATC CAC GCC GTC GCG CCG CGG CGT CCT CGG GAC CCC GCC GCC
2753 CTC GAG GAG GGC GAA GCG CTG CTC GAG CGC GCC TAC CGC AGC ATC GTC GCG
2804 CTA GCC GCC GCG CGT CGG TGG GCG TAT GTC GCG TGC CCC CTC CTC GGC GCT
2855 GGC GTC TAC GGC TGG TCT GCT GCG GAG TCC CTT CGA GCC GCG CTC GCG
2903 GCT ACG CGC GCC GAG CCC GTC GAG CGC GTG AGC CTG CAC ATC TGC CAC CCC
2954 GAC CGC GCC ACG CTG ACG CAC GCC TCC GTG CTC GTC GGC GCG GGG CTC GCT
3005 GCC AGG CGC GTC AGT CCT CCT CCG ACC GAG CCC CTC GCA TCT TGC CCC
3053 GCC GGT GGC CCG GGC CGA CCG GCT CAG CGC AGC GCG TCG CCC CCA GCG ACC
3104 CCC CTT GGG GAT GCC ACC GCG CCC GAG CCC CGC GGA TGC CAG GGG TGC GAA
3155


CTC 


TGC 


CGG 
GGC
GTG
GGC
GTG
GTG
GGC


CAC 
GTC
GTC
3905


CCC 
CTT
TGC
CCC


GTC 


CCT 


GAC 


CGC 


4454 


TGG 


CGC 


TTC 


CCC 


GAC 


TGC 


TGG 
4505


GAC 


ATC 


GAG 
GAG


CGC 


ACC 


GGC 
GGC
GAC


CTT 
4604


CTT 
GAG
CGC
CGC


GAC 


CTC 


4754


GAC 


GCC 


CTC 


TAC 


CTC 


CAC 


GAG 


CTC 



AAT GAC CGC GCC TAC GTC AAC CTG 

ACC AGC TGG GCG ATG CGC ATT CCC GAG 

CTC GCC ACG CAT TTT CCA CTA AAC CAC 

GTC AG3 CCC CCG CGA GGC ATG TGC 

GGC TGG CAG GGC ATG CCG CAG GTG CGG 

GCC CTC TGC CGC ACA GGC GTG CCC CCT 

CTA GAC CCA AAC ACC TGC TGG CTC 

GTT GCG CGC GCC TGC GGC GCC TAC ACG 

TAC GO: CGC GCC CTG AGC GAA GCC CGC 

AGC CAG CGG TGG AGC GCG AGC CAC 

GGA GAC CCC CTC GAC CCC CTG ATG GAG 

GTA TGC GTC GGC TCC GAG CAA GAG GCC 

CTC CAC CGT GCC CCC AAT GGT CCG 

GCG CGC CCC GAG GGG GGC AAC CCC ACC 

GGC GGC CCA CGC CGC GTC TCG GAC CGC 

Pp90 

ACT TGT 

CAG GCG TAC TAC GAC GAC CTC GAG GTG 
4805 CTC AGC GCG TTC CTC
4853 CCC GCC GGC ATT GAC
4904 CCG CCC GCC GAC GGC
4955 CGC ACT CTA GAG GAG
5003 GCG GAC CTC AAC CGC
5054 ATC TCG CGT CAC CTG
5105 CGC GTT CTC AGT GCC
5156 GGG TCG ACC CTC CGC
5207 CAG ATC CCA CCC CCG
5256 ACG TAC TTG CGG GAA
5306 GGC GTG GCC GCG CGG
5357 ATC TTT GCC GGC ATG
5408 AAA GCC ACC TTG AAG
5456 GAG GAC TGC CAC GCC
5507 GCC AAG GAG TGG GTC
5558 ATT ATC ATG CGC GCC
5606 ACG GAG CCC GAG GTC
5657 ATC GAG GTC GAC TTC
5708 GAC GTC GAG CTC GAG
5756 GAA GAC TAC CGC GCG
5807 GGC TCC ACT GAG ACC
5858 CTG CAC AAC ACC ACC
5906 AAA GGC GTG CGC TGG
5957 CTC CCC GAG GGC GCG
6008 GGC TTG TTC GGC TTC
6056 CCC AGC TTC TGC GGG
6107 ATG CAC CAG GCA ATC
6158 GAA GAA CAG CAG GTG
6206 GCT CTG CCT GAC ACC
6257 GAG CGC GTC CTC GCT
6306 GGC CTC GAC CAC CCG
6356 CCC TAC GCG CGC GCC
CGG


GGG 


CGC 


GAG 
TAA


CGC CCC 


CGT 


ACG 


TGG 
p- ► subgenome (NTR)

64C^ GGC CTT TAA TCT CAC CTA CTC TAA CCA jGGTCATCACC ChCCGTTGT? 

64 SI TCGCCGCATC TGGTGGGTAC CCCACTCTTG CCATTCGGGA GAGCCCCAGG GTGCCCGA 



6 5 0C ATG GCT TCC ACT ACC CCC ATC ACC ATG GAG GAG CTT CAG AAG GCC 
7958 CTG GTC GTT CTT ACC GCC CGC CCC GAA GAC GGC TGG ACT TGC CGC GGC
8006 GTG CCC GCC CAT CCA GGT ACC CGC TGC CCC GAA CTG GTG AGO CCC ATG GGA
8057 CGC GOG ACT TGC TCC CCA GCC TCG GCC CTC TGG CTC GCC ACA GCG AAC GCG
8108 CTG TCT CTT GAC CAC GCG CTC GCG GCC TTT GTC CTG CTG GTC CCG TGG
8156 GTC CTG ATA TTT ATG GTG TGC CGC CGC GCC TGT CGC CGC CGC GGC GCC GCC
8207 GCC GCC CTC ACC GCA GTC GTC CTG CAG GGG TAC AAC CCC CCC GCC TAT GGC
→ E1
8258 GAG GAG GCT TTC ACC TAC CTC TGC ACT GCA CCG GGG TGC GCC ACT CAA
8306 ACA CCT GTC CCC GTG CGC CTC GCT GGC GTC CGC TTT GAG TCC AAG ATC GTG
8357 GAC GGC GGC TGC TTT GCC CCA TGG GAC CTC GAG GCC ACT GGA GCC TGC ATC
8408 TGC GAG ATC CCC ACT GAT GTC TCG TGC GAG GGC TTG GGG GCC TGG GTA
8456 CCC ACA GCC CCT TGC GCG CGC ATC TGG AAT GGC ACA CAG CGC GCG TGC ACC
8507 TTC TGG GCT GTC AAC GCC TAC TCC TCT GGC GGG TAC GCG CAG CTG GCC TCT
8558 TAC TTC AAC CCT GGC GGC AGC TAC TAC AAG CAG TAC CAC CCC ACC GCG
8606 TGC GAG GTT GAA CCT GCC TTC GGA CAC AGC GAC GCG GCC TGC TGG GGC TTC
8657 CCC ACC GAC ACC GTG ATG AGC GTG TTC GCC CTT GCT AGC TAC GTC CAG CAC
8708 CCT CAC AAG ACC GTC CGG GTC AAG TTT CAT ACA GAG ACT AGG ACC GTC
8756 TGG CAA CTC TCC GTA GCC GGC GTG TCG TGC GAT GTC ACC ACT GAA CAC CCG
8807 TTC TGC AAC ACG CCG CAC GGA CAA CTC GAG GTC CAG GTC CCG CCC GAC CCT
8858 GGG GAC ATG GTT GAG TAC ATT ATG AAT TAC ACC GGC AAT CAA CAG TCC
8906 CGG TGG GGC CTC GGG AGC CCG AAC TGT CAT GGC CCC GAT TGG GCC TCC CCG
8957 GTT TGC CAA CGC CAT TCC CCT GAC TGC TCG CGG CTT GTG GGG GCC ACG CCA
9008 GAG CGT CCC CGG CTG CGC CTG GTC GAC GCC GAC GAC CCC CTG CTG CGC
9056 ACT GCC CCT GGG CCC GGC GAG GTG TGG GTC ACG CCT GTC ATA GGC TCT CAG
9107 GCG CGC AAG TGC GGA CTC CAC ATA CGC GCT GGA CCG TAC GGC CAT GCT ACC
9158 GTC GAA ATG CCC GAG TGG ATC CTC GCC CAC ACC ACT AGC GAC CCC TGG
9206 CAC CCA CCG GGC CCC TTG GGG CTG AAG TTC AAG ACA GTT CGC CCG GTG ACC
9257 CTG CCA CGC GCG TTA GCG CCA CCC CGC AAT GTG CGT GTG ACC GGT TGC TAC
9308 CAG TGC GGT ACC CCC GCG CTG GTG GAA GGC CTT GCC CCA GGG GGA GGG
9356 AAC TGC CAT CTC ACC GTC AAT GGC GAG GAC GTC GGC GCC TTC CCC CCT GGG
9407 AAG TTC GTC ACC GCC GCC CTC CTC AAC ACC CCC CCG CCC TAC CAA GTC AGC
9458 TGC GGG GGC GAG AGC GAT CGC GCG AGC GCG CGG GTC ATT GAC CCC GCC
9506 GCG CAA TCG TTT ACC GGC GTG GTG TAT GGC ACA CAC ACC ACT GCT GTG TCG
9557 GAG ACC CGG CAG ACC TCG GCG GAG TGG GOT GOT GOT GAT TGG TGG CAG CTC
9608 ACT CTG GGC GCC ATT TOO GCC CTC CCA CTC GCT GGC TTA CTC GCT TGC
9656 TGT GCC AAA TGC TTG TAG TAG TTG CGC GGC GCT ATA GCG CCG CGC TAG ; TGG
9707 GCCCCCGCGC GAAACCCGCA CTAGCCCACT AGATTTCCGC ACCTGTTGCT GTATAG
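The position label at the start of each row of the listing appears to give the number of the first nucleotide shown on that row, so a label can be checked by adding the nucleotides on the previous row to the previous label. The following is a minimal Python sketch of that check, offered as an illustration only and not as part of the original specification; the function names are hypothetical, and the two rows are copied from the listing with the scanning gaps in the labels closed up.

```python
def parse_row(row: str) -> tuple[int, list[str]]:
    """Split a listing row such as '1001 GTG GGC ...' into (label, codons)."""
    fields = row.split()
    return int(fields[0]), fields[1:]


def next_label(row: str) -> int:
    """Label expected on the following row: start position plus nucleotides shown."""
    label, codons = parse_row(row)
    return label + sum(len(codon) for codon in codons)


# Two consecutive rows taken from the listing.
ROW_1001 = "1001 GTG GGC CGG TGG CGG TGG TTC AGC CTC CCC CGC CCC GTG TTC CAG CGC ATG"
ROW_1052 = "1052 CTG TCC TAC TGC AAG ACC CTG AGC CCC GAC GCG TAC TAC AGC GAG CGC"

if __name__ == "__main__":
    print(next_label(ROW_1001))  # 17 codons after 1001 -> 1052
    print(next_label(ROW_1052))  # 16 codons after 1052 -> 1100
```

A row whose computed successor disagrees with the printed label on the next row is a candidate for a scanning error in either the label or the codons.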
WE CLAIM:

1. A nucleic acid corresponding to a nucleic acid
encoding a Cendehill rubella protein selected from the
group consisting of: p150; p90; C; E1; E2.

2. A nucleic acid corresponding to a non-translated
region of the Cendehill genome.

3. The nucleic acid of claim 2 wherein the non-translated
region is a 5' non-translated region in which at least one
of a terminal loop or a medial loop is different in size as
compared to wild-type rubella 5' non-translated region.

4. A nucleic acid which includes a sequence or sequences
of nucleotides corresponding to a 5' non-translated region,
p90 and p150 of Cendehill.

5. DNA including a sequence of nucleotides corresponding
to the entire Cendehill genome as shown in Appendix 1.

6. The nucleic acid of any one of claims 1-4 which is
DNA.

7. The nucleic acid of any one of claims 1-4 which is
RNA.

8. The nucleic acid of any one of claims 1-7 further
including one or more sequences of nucleotides
corresponding to all or part of a genome of a rubella
strain other than Cendehill.

9. A plasmid or viral vector that includes a nucleic acid
according to any one of claims 1-5 or 8, wherein the
nucleic acid is DNA.
10. DNA comprising a sequence of nucleotides complementary
to rubella genomic RNA capable of encoding an infectious
virus of the Cendehill strain or having an attenuating
phenotype comparable to Cendehill.

11. DNA including a first sequence of nucleotides
corresponding to one or more of: a non-translated region,
p150, p90, C, E1 and E2 of Cendehill strain; and, a second
sequence of nucleotides that is derived from a rubella
virus strain other than Cendehill, wherein said DNA encodes
an infectious rubella virus.

12. DNA comprising sequences of nucleotides corresponding
to nucleotides 1 to 5355 of Cendehill and nucleotides 5356
to 9762 of RA27/3.

13. The DNA of claim 10, 11 or 12, in a plasmid or viral
vector capable of replication and transcription of the DNA.

14. DNA comprising one or more sequences of nucleotides
encoding all or part of one or more of: p150, p90, C, E2
and E1 of Cendehill virus, incorporated into an expression
vector.

15. A method of producing rubella virus comprising the
steps of transcribing the DNA of claim 14 into RNA;
transfecting cells with said RNA; and, recovering rubella
virus from the transfected cells.

16. Rubella virus obtained by the method of claim 15
wherein the DNA transcribed includes a sequence of
nucleotides derived from a rubella virus strain other than
Cendehill.

17. A method of producing DNA encoding a recombinant or
chimeric rubella virus exhibiting the lack of
arthrotropicity of Cendehill virus, comprising a step
whereby:

(a) nucleotides in Cendehill cDNA encoding viral
structural protein are altered such that the protein so
encoded increases immunogenicity of a recombinant rubella
virus comprising said protein;

(b) nucleotides in the non-translated regions or
non-structural protein region of cDNA for rubella virus
other than Cendehill are altered to decrease
arthritogenicity of a recombinant rubella virus coded for
by the altered cDNA; or,

(c) cDNA for one or more of a Cendehill
non-translated region, non-structural protein p150, and
non-structural protein p90 is joined to cDNA for a rubella
virus other than Cendehill to produce DNA corresponding to
a complete RNA genome of a chimeric rubella virus.



18. An infectious clone for a rubella virus comprising a
vector which includes cDNA corresponding to one or more
portions of Cendehill genome selected from the group
consisting of: a non-translated region, protein p150,
protein p90, protein C, protein E1 and protein E2; and
wherein at least a part of cDNA in the infectious clone is
cDNA for a rubella virus other than Cendehill.

19. A method of producing rubella RNA comprising the step
of transcribing the infectious clone of claim 18.

20. Rubella RNA produced according to the method of
claim 19.

21. A method of producing a rubella virus comprising the
steps of transfecting cells with RNA produced according to
claim 19, and recovering rubella virus from the transfected
cells.

22. A rubella virus comprising a genome including a first
portion which is equivalent to one or more ribonucleic
acids selected from the group consisting of: Cendehill
non-translated RNA; Cendehill p150 RNA; p90 RNA; C RNA;
E1 RNA; E2 RNA; and wherein a second portion of the genome
is equivalent to RNA of a rubella virus other than
Cendehill.

23. The virus of claim 22 wherein the virus other than
Cendehill is RA27/3.

24. The virus of claim 19 or 20 wherein the first portion
is all of the Cendehill 5' non-translated RNA, p150 RNA,
and p90 RNA.

25. A Cendehill viral protein free of virus, selected from
the group consisting of: p150, p90, C, E1 and E2, produced
by expressing Cendehill cDNA encoding said protein from an
expression vector.

26. Rubella cDNA, RNA, or a rubella virus having one or
more nucleotide substitutions selected from the group
consisting of: 37-C; 55-G; 118-T(or)U; 358-C; 2829-A;
3060-G; 3164-C; 3528-T(or)U; 4530-T(or)U; 6611-C; 6770-G;
6771-G; 7428-T(or)U; 8786-G; 8788-T(or)U; 8864-A;
9180-T(or)U; 9254-A; and 9741-T(or)U, wherein the aforesaid
numbering of the nucleotide substitution is with reference
to Appendix 1, and wherein said substitutions occur in the
same context as shown in Appendix 1.

27. A rubella cDNA, RNA or viral genome that encodes a
rubella protein selected from the group of proteins
consisting of: p150/929/tyr; p150/1006/gly; p150/1041/his;
p150/1162/val; p90/1496/ile; C4/pro; C/57/gly; E2/30b/val;
E2/413/ile; E1/759/asp; E1/785/met; E1/890/leu; and,
E1/915/thr, wherein the aforesaid proteins are identified
by reference to a strain-specific amino acid in Cendehill
polyprotein and wherein the strain-specific amino acid
occurs in the same context as in the Cendehill polyprotein.

28. Use of DNA incorporated into an expression vector 
according to claim 14 as a sub-unit vaccine. 

29. Use of DNA of claim 18 as a DNA vaccine. 
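Claims 12 and 26 both rely on nucleotide coordinates: claim 12 joins nucleotides 1 to 5355 of one genome to nucleotides 5356 to 9762 of another, and claim 26 names substitutions as a position followed by a base, for example 118-T(or)U. The sketch below shows one way such coordinates could be handled in Python. It is illustrative only and is not the applicants' method; the function names are hypothetical, the toy sequences are placeholders rather than rubella sequences, and it assumes both genomes are already in the same 1-based numbering.

```python
def chimera(donor_5prime: str, donor_3prime: str, junction: int) -> str:
    """Join nucleotides 1..junction of one genome to junction+1..end of another.

    Positions are 1-based. Both sequences are assumed to share one coordinate
    system (an assumption of this sketch).
    """
    if junction < 1 or len(donor_5prime) < junction or len(donor_3prime) <= junction:
        raise ValueError("junction lies outside one of the sequences")
    return donor_5prime[:junction] + donor_3prime[junction:]


def parse_substitution(token: str) -> tuple[int, str]:
    """Turn '118-T(or)U' into (118, 'T') and '37-C' into (37, 'C')."""
    position, base = token.split("-", 1)
    return int(position), base.strip()[0].upper()


def has_substitution(genome: str, token: str) -> bool:
    """True when the genome carries the named base; T and U are treated alike."""
    position, base = parse_substitution(token)
    observed = genome[position - 1].upper().replace("U", "T")
    return observed == base.replace("U", "T")


if __name__ == "__main__":
    # Toy 12-nt placeholders; a real use would pass full-length genomes
    # and junction=5355 as recited in claim 12.
    print(chimera("A" * 12, "G" * 12, 5))
    print(has_substitution("ACGUAC", "4-T(or)U"))
```

Treating T and U as equivalent lets the same check run against either cDNA or RNA text, which matches the "T(or)U" wording of claim 26.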
RV 3' end complement
F1  5'-CGOGAATTC [poly(T) run, length illegible] CTATACAGCAACAGGT
       EcoRI

RV 5' start
F2  5'-TCGAAGCTT ATTTAGGTGACACTATA CAATGGAAGCTATCGGACCTCGCTTAGG
       HindIII   SP6

9    5'-TGCAGCGTTCGACGCAAACG-      2133-2153
10   5'-TCCGAGTGCCGTTGCGATC-       2243-2262
16   5'-GCGTTCTTGATGTCGATATCGCG-   4410-4431
18   5'-CTCACTGATGTCTACACGCAGATG-  5281-5763
46   5'-CAACCACCTCGGGAATGC-        3241-3260
125  5'-TAGTCTTCGGCGCTTGG-         5747-5763
251  5'-TTTGCCAACGCCACGGC-         2603-2618



Figure 2 
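Primer F1 is labelled "RV 3' end complement", which suggests that its 3' portion is the reverse complement of the last nucleotides of the genome listing. A minimal Python sketch of that relationship follows; it is an illustration rather than part of the original figure, the function and constant names are hypothetical, and the two sequences are the final 16 nucleotides of the listing and the final 16 nucleotides of F1 as printed.

```python
# Translation table for the four DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


GENOME_3PRIME_END = "ACCTGTTGCTGTATAG"  # last 16 nt of the sequence listing
F1_3PRIME_PART = "CTATACAGCAACAGGT"     # last 16 nt of primer F1 in Figure 2

if __name__ == "__main__":
    print(reverse_complement(GENOME_3PRIME_END) == F1_3PRIME_PART)
```

The same function can be applied to any of the numbered primers to test whether a given primer matches the listed strand or its complement at the stated coordinates.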
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